Yifu Ding

SPA-Cache: Singular Proxies for Adaptive Caching in Diffusion Language Models

Jan 30, 2026

MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping

Nov 19, 2025

VORTA: Efficient Video Diffusion via Routing Sparse Attention

May 24, 2025

QVGen: Pushing the Limit of Quantized Video Generative Models

May 16, 2025

Dynamic Parallel Tree Search for Efficient LLM Reasoning

Feb 22, 2025

LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

Oct 28, 2024

A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms

Sep 25, 2024

PTQ4SAM: Post-Training Quantization for Segment Anything

May 06, 2024

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

Feb 19, 2024

OHQ: On-chip Hardware-aware Quantization

Sep 05, 2023